Abstract

We study the mixing time of a systematic scan Markov chain for sampling from 
the uniform distribution on proper 7-colourings of a finite rectangular sub-grid 
of the infinite square lattice, the grid. A systematic scan Markov chain cycles 
through finite-size subsets of vertices in a deterministic order and updates the 
colours assigned to the vertices of each subset. The systematic scan Markov chain
that we present cycles through subsets consisting of 2x2 sub-grids and updates the
colours assigned to the vertices using a procedure known as heat-bath. We give a
computer-assisted proof that this systematic scan Markov chain mixes in $O(\log n)$
scans, where $n$ is the size of the rectangular sub-grid. We make use of a heuristic to
compute required couplings of colourings of 2x2 sub-grids. This is the first time a
systematic scan Markov chain on the grid has been shown to be rapidly mixing
for fewer than 8 colours. We also give partial results that underline the challenges of
proving rapid mixing of a systematic scan Markov chain for sampling 6-colourings
of the grid by considering 2x3 and 3x3 sub-grids.

1 Introduction

This paper is concerned with sampling from the uniform distribution, $\pi$, on the set
of proper $q$-colourings of a finite-size rectangular grid. A $q$-colouring of a graph is an
assignment of a colour from a finite set of $q$ distinct colours to each vertex, and we say that
a colouring is a proper colouring if no two adjacent vertices are assigned the same colour.
Proper $q$-colourings of the grid correspond to the zero-temperature anti-ferromagnetic
$q$-state Potts model on the square lattice, a model of significant importance in statistical
physics (see for example Salas and Sokal [14]).

Sampling from $\pi$ is computationally challenging. It nevertheless remains an important task
and is frequently carried out in experimental work by physicists by simulating some
suitable random dynamics that converges to $\pi$. Ensuring that a dynamics converges to
$\pi$ is generally straightforward, but obtaining good upper bounds on the number of steps
required for the dynamics to become sufficiently close to $\pi$ is a much more difficult
problem. Physicists are at times forced to "guess" (using some heuristic methods) the number
of steps required for their dynamics to be sufficiently close to the uniform distribution in
order to carry out their experiments. By establishing rigorous bounds on the convergence
rates (mixing time) of these dynamics, computer scientists can provide underpinnings for
this type of experimental work and also allow a more structured approach to be taken.

Providing bounds on the mixing time of Markov chains is a well-studied problem in
theoretical computer science. However, the types of Markov chains frequently considered
by computer scientists do not always correspond to the dynamics usually used in the
experimental work by physicists. In computer science, the mixing times of various types
of random update Markov chains have been frequently analysed, notably on the grid by
Achlioptas, Molloy, Moore and van Bussel [1] and Goldberg, Martin and Paterson [9].
We say that a Markov chain on the set of colourings is a random update Markov chain
when one step of the process consists of randomly selecting a set of vertices (often
a single vertex) and updating the colours assigned to those vertices according to some
well-defined distribution induced by $\pi$. Experimental work is, however, often carried out
by cycling through and updating the vertices (or subsets of vertices) in a deterministic
order. This type of dynamics has recently been studied by computer scientists in the
form of systematic scan Markov chains (systematic scan for short). For results regarding
systematic scan see for instance Dyer, Goldberg and Jerrum [4, 5] and Pedersen [12],
although these papers do not consider the grid specifically. It is important to note
that systematic scan remains a random process since the method used to update the
colour assigned to the selected set of vertices is a randomised procedure drawing from
some well-defined distribution induced by $\pi$.

In Section 3 we present a computer-assisted proof that systematic scan mixes rapidly
when considering 7-colourings of the grid. Previously eight was the least number of
colours for which systematic scan on the grid was known to be rapidly mixing, due to
Pedersen [12], a result which we improve on in this paper. We will make use of
a recent result by Pedersen [12] to prove rapid mixing of systematic scan by bounding
the influence on a vertex (note that the literature traditionally talks about sites rather
than vertices). We will provide bounds on this influence parameter by using a heuristic
to mechanically construct sufficiently good couplings of proper colourings of a 2x2
sub-grid. We will hence use a heuristic-based computation in order to establish a rigorous
result about the mixing time of a systematic scan Markov chain. Finally, in Section 4
we consider the possibility of proving rapid mixing of systematic scan for 6-colourings of
the grid by increasing the size of the sub-grids. We give lower bounds on the appropriate
influence parameter which show that the proof technique we employ does not yield rapid
mixing of systematic scan for 6-colourings of the grid when using 2x2, 2x3 and 3x3
sub-grids.

1.1 Preliminaries and statement of results 

Let $Q = \{1, \dots, 7\}$ be the set of colours and $V = \{1, \dots, n\}$ the set of vertices of a finite
rectangular grid $G$ with toroidal boundary conditions. Working on the torus is common
practice as it avoids treating several technicalities regarding the vertices on the boundary
of a finite grid as special cases and hence lets us present the proof in a cleaner way.
We point out however that these technicalities are straightforward to deal with (more
on this in Section 2). We formally say that a colouring $\sigma$ of $G$ is a function from $V$ to
$Q$. Let $\Omega^+$ be the set of all colourings of $G$ and $\Omega$ be the set of all proper $q$-colourings.
Then the distribution $\pi$, described earlier, is the uniform distribution on $\Omega$. If $\sigma \in \Omega^+$
is a colouring and $j \in V$ is a vertex then $\sigma_j$ denotes the colour assigned to vertex $j$ in
colouring $\sigma$. Furthermore, for a subset of vertices $\Lambda \subseteq V$ and a colouring $\sigma \in \Omega^+$ we let
$\sigma_\Lambda$ denote the colouring of the vertices in $\Lambda$ under $\sigma$. For each vertex $j \in V$, let $S_j$ denote
the set of pairs $(\sigma, \tau) \in \Omega^+ \times \Omega^+$ of colourings that only differ on the colour assigned to
vertex $j$, that is $\sigma_i = \tau_i$ for all $i \neq j$.

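Let $\mathcal{M}$ be a Markov chain with state space $\Omega^+$ and stationary distribution $\pi$. Suppose
that the transition matrix of $\mathcal{M}$ is $P$. Then the mixing time from an initial colouring
$\sigma \in \Omega^+$ is the number of steps, that is applications of $P$, required for $\mathcal{M}$ to become
sufficiently close to $\pi$. Formally the mixing time of $\mathcal{M}$ from an initial colouring $\sigma \in \Omega^+$
is defined, as a function of the deviation $\varepsilon$ from stationarity, by

$$\mathrm{Mix}_\sigma(\mathcal{M}, \varepsilon) = \min\{t > 0 : d_{TV}(P^t(\sigma, \cdot), \pi) < \varepsilon\}, \tag{1}$$

where

$$d_{TV}(\theta_1, \theta_2) = \frac{1}{2} \sum_{i \in \Omega^+} |\theta_1(i) - \theta_2(i)| = \max_{A \subseteq \Omega^+} |\theta_1(A) - \theta_2(A)| \tag{2}$$

is the total variation distance between two distributions $\theta_1$ and $\theta_2$ on $\Omega^+$. The mixing
time $\mathrm{Mix}(\mathcal{M}, \varepsilon)$ of $\mathcal{M}$ is then obtained by maximising over all possible initial colourings

$$\mathrm{Mix}(\mathcal{M}, \varepsilon) = \max_{\sigma \in \Omega^+} \mathrm{Mix}_\sigma(\mathcal{M}, \varepsilon). \tag{3}$$

We say that $\mathcal{M}$ is rapidly mixing if the mixing time of $\mathcal{M}$ is polynomial in $n$ and $\log(\varepsilon^{-1})$.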
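As an illustration of the two equivalent forms of the total variation distance, the following Python sketch evaluates both on a small example. The three-element state space and the two distributions are hypothetical and chosen only so that the arithmetic is exact; the maximum over events is computed by brute force, which is feasible only for tiny state spaces.

```python
from itertools import chain, combinations

def tv_half_l1(theta1, theta2):
    """Total variation distance as half the L1 distance between two distributions."""
    support = set(theta1) | set(theta2)
    return 0.5 * sum(abs(theta1.get(x, 0.0) - theta2.get(x, 0.0)) for x in support)

def tv_max_event(theta1, theta2):
    """Total variation distance as the largest gap |theta1(A) - theta2(A)| over events A."""
    support = sorted(set(theta1) | set(theta2))
    events = chain.from_iterable(combinations(support, r) for r in range(len(support) + 1))
    return max(abs(sum(theta1.get(x, 0.0) - theta2.get(x, 0.0) for x in event))
               for event in events)

# Hypothetical distributions on the state space {"a", "b", "c"}.
theta1 = {"a": 0.5, "b": 0.25, "c": 0.25}
theta2 = {"a": 0.25, "b": 0.25, "c": 0.5}
print(tv_half_l1(theta1, theta2), tv_max_event(theta1, theta2))
```

Both functions return the same value on this example, with the maximum attained for instance by the event {"a"}.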

We will make use of a recent result by Pedersen [12] to study the mixing time of a
systematic scan Markov chain for 7-colourings of the grid using block updates. We need
the following notation in order to define our systematic scan Markov chain. Define the
following set $\Theta = \{\Theta_1, \dots, \Theta_m\}$ of $m$ blocks. Each block $\Theta_k \subseteq V$ is a 2x2 sub-grid and $m$
is the smallest integer such that $\bigcup_{k=1}^{m} \Theta_k = V$. For any block $\Theta_k$ and a pair of colourings
$\sigma, \tau \in \Omega^+$ we write "$\sigma = \tau$ on $\Theta_k$" if $\sigma_i = \tau_i$ for each $i \in \Theta_k$ and similarly "$\sigma = \tau$ off
$\Theta_k$" if $\sigma_i = \tau_i$ for each $i \in V \setminus \Theta_k$. We also let $\partial\Theta_k$ denote the set of vertices in $V \setminus \Theta_k$
that are adjacent to some vertex in $\Theta_k$ and we will refer to $\partial\Theta_k$ as the boundary of $\Theta_k$.
Note from our previous definitions that $\sigma_{\partial\Theta_k}$ denotes the colouring of the boundary of
$\Theta_k$ under a colouring $\sigma \in \Omega^+$. We will refer to $\sigma_{\partial\Theta_k}$ as a boundary colouring. Finally we
say that a 7-colouring of the 2x2 sub-grid $\Theta_k$ agrees with a boundary colouring $\sigma_{\partial\Theta_k}$ if
(1) no adjacent sites in $\Theta_k$ are assigned the same colour and (2) each vertex $j \in \Theta_k$ is
assigned a colour that is different to the colours of all boundary vertices adjacent to $j$.

For each block $\Theta_k$ and colouring $\sigma \in \Omega^+$ let $\Omega_k(\sigma)$ be the subset of $\Omega^+$ such that
if $\sigma' \in \Omega_k(\sigma)$ then $\sigma' = \sigma$ off $\Theta_k$ and $\sigma'_{\Theta_k}$ agrees with $\sigma_{\partial\Theta_k}$. Let $\pi_k(\sigma)$ be the uniform
distribution on $\Omega_k(\sigma)$. We then define $P^{[k]}$ to be the transition matrix on the state space
$\Omega^+$ for performing a so-called heat-bath move on $\Theta_k$. A heat-bath move on a block $\Theta_k$,
given a colouring $\sigma \in \Omega^+$, is performed by drawing a new colouring from the distribution
$\pi_k(\sigma)$. Note in particular that applying $P^{[k]}$ to a colouring $\sigma \in \Omega^+$ results in a colouring
$\sigma' \in \Omega^+$ such that $\sigma' = \sigma$ off $\Theta_k$ and the colouring $\sigma'_{\Theta_k}$ of $\Theta_k$ is proper and agrees with
the colouring $\sigma'_{\partial\Theta_k}$ of the boundary of $\Theta_k$ (which is identical to $\sigma_{\partial\Theta_k}$). We formally define
the following systematic scan Markov chain for 7-colourings of $G$, which systematically
performs heat-bath moves on 2x2 sub-grids, as follows. It is worth pointing out that this
holds for any ordering of the set of blocks.

Definition 1. The systematic scan dynamics for 7-colourings of $G$ is a Markov chain
$\mathcal{M}_{\text{grid}}$ with state space $\Omega^+$ and transition matrix $P_{\text{grid}} = \prod_{k=1}^{m} P^{[k]}$.

It can be shown that the stationary distribution of $\mathcal{M}_{\text{grid}}$ is $\pi$ by considering the
construction of $P_{\text{grid}}$. It is customary to refer to one application of $P_{\text{grid}}$ (that is updating
each block once) as one scan. One scan takes $\sum_k |\Theta_k|$ vertex updates and by construction
of $\Theta$ this sum is clearly of order $O(n)$.
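The heat-bath move and one scan can be sketched in Python as follows. This is a minimal illustration and not the program used for the proof. It assumes a torus with an even number of rows and columns (at least four of each), a colouring stored as a dictionary from `(row, column)` to a colour, and a fixed tiling by blocks whose top-left corners have even coordinates; the function names are hypothetical.

```python
import itertools
import random

Q = range(1, 8)  # the seven colours

def block_vertices(r, c, rows, cols):
    """The four vertices of the 2x2 block with top-left corner (r, c) on the torus."""
    return [(r % rows, c % cols), (r % rows, (c + 1) % cols),
            ((r + 1) % rows, c % cols), ((r + 1) % rows, (c + 1) % cols)]

def neighbours(v, rows, cols):
    """The four grid neighbours of vertex v under toroidal boundary conditions."""
    r, c = v
    return [((r - 1) % rows, c), ((r + 1) % rows, c), (r, (c - 1) % cols), (r, (c + 1) % cols)]

def agreeing_colourings(sigma, block, rows, cols):
    """All colourings of the block that are proper inside it and agree with its boundary."""
    result = []
    for colours in itertools.product(Q, repeat=len(block)):
        candidate = dict(zip(block, colours))
        # A neighbour inside the block takes its candidate colour, otherwise its colour in sigma.
        if all(candidate[v] != candidate.get(u, sigma[u])
               for v in block for u in neighbours(v, rows, cols)):
            result.append(candidate)
    return result

def heat_bath_move(sigma, block, rows, cols, rng):
    """Replace the colouring of the block by a uniform draw from the agreeing colourings."""
    choice = rng.choice(agreeing_colourings(sigma, block, rows, cols))
    updated = dict(sigma)
    updated.update(choice)
    return updated

def scan(sigma, rows, cols, rng):
    """One scan: a heat-bath move on each block of the tiling, in a fixed order."""
    for r in range(0, rows, 2):
        for c in range(0, cols, 2):
            sigma = heat_bath_move(sigma, block_vertices(r, c, rows, cols), rows, cols, rng)
    return sigma
```

With seven colours the set of agreeing colourings is never empty, since each block vertex has at most two boundary neighbours. Starting from any colouring in $\Omega^+$, even an improper one, the colouring obtained after one scan is proper.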

We will prove the following theorem and point out that this is the first proof of rapid
mixing of systematic scan for 7-colourings on the grid.

Theorem 2. Let $\mathcal{M}_{\text{grid}}$ be the Markov chain from Definition 1 on 7-colourings of $G$.
Then the mixing time of $\mathcal{M}_{\text{grid}}$ is

$$\mathrm{Mix}(\mathcal{M}_{\text{grid}}, \varepsilon) < 63 \log(n\varepsilon^{-1}). \tag{4}$$

1.2 Context and related work 

We now provide an overview of previous achievements for colourings of the grid. Previously
it was known that systematic scan for $q$-colourings on general graphs with maximum
vertex degree $\Delta$ mixes in $O(\log n)$ scans when $q \geq 2\Delta$ due to Pedersen [12]. That result
is a hand-proof and uses block updates that update the colour at each endpoint of an
edge during each step. Earlier Dyer et al. [1] had shown that a single-site systematic
scan Markov chain (where one vertex is updated at a time) mixes in $O(\log n)$ scans when
$q > 2\Delta$ and in $O(n^2 \log n)$ scans when $q = 2\Delta$. It is hence well-established that systematic
scan is rapidly mixing for $q$-colourings of the grid when $q \geq 8$ but nothing has been
known about the mixing time for smaller $q$. The results of both Pedersen [12] and Dyer
et al. [1] bound the mixing time by studying the influence on a vertex. We will use that
technique in this paper as well; however, we will construct the required couplings using a
heuristic. We defer the required definitions to Section 2, which also contains the proof of
Theorem 2.

Recent results have revealed that, in a single-site setting, one is not restricted to using
the total influence on a vertex when analysing the mixing time of systematic scan by
bounding influence parameters. In a single-site setting one can define an $n \times n$ matrix
whose entries are the influences that all vertices have on each other. Hayes [10] has
shown that providing a sufficiently small upper bound on the spectral gap of this matrix
implies rapid mixing of both systematic scan and random update. Dyer, Goldberg and
Jerrum [6] furthermore showed that a sufficiently small upper bound on any matrix norm
also implies rapid mixing of both types of Markov chains. These techniques are however
not known to apply to Markov chains using block moves. See the PhD thesis by Pedersen [13]
for a more comprehensive review of the above results and for the difficulties in extending
them to cover block dynamics.

As random update Markov chains have received more attention than systematic scan
we also summarise some mixing results of interest regarding $q$-colourings of the grid (recall
that a random update Markov chain randomly selects a subset of sites to be updated at
each step). Achlioptas et al. [1] give a computer-assisted proof of mixing in $O(n \log n)$
updates when $q = 6$ by considering blocks consisting of 2x3 sub-grids. Our computations
are similar in nature to the ones of Achlioptas et al.; however, their computations are not
sufficient to imply mixing of systematic scan as we will discuss in due course. More
recently Goldberg, Martin and Paterson [9] gave a hand-proof of mixing in $O(n \log n)$
updates when $q \geq 7$ using the technique of strong spatial mixing. Previously Salas and
Sokal [14] gave a computer-assisted proof of the $q = 7$ result, which was also
implied by another computer-assisted result due to Bubley, Dyer and Greenhill [3] that
applies to 4-regular triangle-free graphs. Finally it is worth pointing out that, in the
special case when $q = 3$, two complementary results of Luby, Randall and Sinclair [11]
and Goldberg, Martin and Paterson [8] give rapid mixing of random update.

2 Bounding the mixing time of systematic scan 

This section contains a proof of Theorem 2, although the proof of a crucial lemma,
which requires computer assistance, is deferred to Section 3. We will bound the mixing
time of $\mathcal{M}_{\text{grid}}$ by bounding the influence on a vertex, a parameter which we denote by
$\alpha$ and will define formally in due course. If $\alpha$ is sufficiently small then Theorem 2 from
Pedersen [12] implies that any systematic scan Markov chain, whose transition matrices
for updating each block satisfy two simple properties, mixes in $O(\log n)$ scans. For
completeness we restate this theorem (Theorem 3 below) and in the statement we let
$\mathcal{M}_\to$ denote a systematic scan Markov chain whose transition matrices for each block
update satisfy the required properties.

Theorem 3. If $\alpha < 1$ then the mixing time of $\mathcal{M}_\to$ is

$$\mathrm{Mix}(\mathcal{M}_\to, \varepsilon) < \frac{\log(n\varepsilon^{-1})}{1 - \alpha}. \tag{5}$$

For each block $\Theta_k$ the transition matrix $P^{[k]}$ needs to satisfy the following two
properties in order for Theorem 3 to apply.

1. If $P^{[k]}(\sigma, \tau) > 0$ then $\sigma = \tau$ off $\Theta_k$, and

2. $\pi$ is invariant with respect to $P^{[k]}$.

It is pointed out in Pedersen [12] that if $P^{[k]}$ is a transition matrix performing a heat-bath
move then both of these properties are easily satisfied. Furthermore, it is pointed out
that when $\Omega$ is the set of proper colourings of a graph, then $\pi$ is the uniform distribution
on $\Omega$ as we require. Since the transition matrices $P^{[k]}$ used in the definition of $\mathcal{M}_{\text{grid}}$
perform heat-bath updates we are hence able to use Theorem 3 to bound the mixing
time of $\mathcal{M}_{\text{grid}}$.

We are now ready to formally define the parameter $\alpha$ denoting the influence on a
vertex. For any pair of colourings $(\sigma, \tau) \in S_i$ let $\Psi_k(\sigma, \tau)$ be a coupling of the distributions
induced by $P^{[k]}(\sigma, \cdot)$ and $P^{[k]}(\tau, \cdot)$, namely $\pi_k(\sigma)$ and $\pi_k(\tau)$ respectively. We remind
the reader that a coupling of two distributions $\pi_1$ and $\pi_2$ on state space $\Omega^+$ is a joint
distribution on $\Omega^+ \times \Omega^+$ such that the marginal distributions are $\pi_1$ and $\pi_2$. For ease of
reference we also let $p_j(\Psi_k(\sigma, \tau))$ denote the probability that a vertex $j \in \Theta_k$ is assigned
a different colour in a pair of colourings drawn from some coupling $\Psi_k(\sigma, \tau)$. We then let

$$\rho^k_{i,j} = \max_{(\sigma, \tau) \in S_i} p_j(\Psi_k(\sigma, \tau)) \tag{6}$$

be the influence of $i$ on $j$ under $\Theta_k$. Finally the parameter $\alpha$ denoting the influence on
any vertex is defined as

$$\alpha = \max_k \max_{j \in \Theta_k} \sum_{i \in V} \rho^k_{i,j}. \tag{7}$$

Pedersen [12] actually defines $\alpha$ with a weight associated with each vertex; however, as
we will not use weights in our proof we have omitted them from the above account. So, in
order to upper bound $\alpha$ we are required to upper bound the probability of a discrepancy
at each vertex $j \in \Theta_k$ under a coupling $\Psi_k(\sigma, \tau)$ of the distributions $\pi_k(\sigma)$ and $\pi_k(\tau)$ for
any pair of colourings $(\sigma, \tau) \in S_i$ that only differ at the colour of vertex $i$. Our main
task is hence to specify a coupling $\Psi_k(\sigma, \tau)$ of $\pi_k(\sigma)$ and $\pi_k(\tau)$ for each pair of colourings
$(\sigma, \tau) \in S_i$ and upper bound the probability of assigning a different colour to each vertex
in a pair of colourings drawn from that coupling.

Consider any block $\Theta_k$ and any pair of colourings $(\sigma, \tau) \in S_i$ that differ only on the
colour assigned to some vertex $i$. Clearly the distribution on colourings of $\Theta_k$ induced
by $\pi_k(\sigma)$ only depends on the boundary colouring $\sigma_{\partial\Theta_k}$. Similarly, the distribution on
colourings of $\Theta_k$ induced by $\pi_k(\tau)$ depends only on $\tau_{\partial\Theta_k}$. If $i \notin \partial\Theta_k$ then the distributions
on the colourings of $\Theta_k$, induced by $\pi_k(\sigma)$ and $\pi_k(\tau)$, respectively, are the same and we
let $\Psi_k(\sigma, \tau)$ be the coupling in which any pair of colourings drawn from $\Psi_k(\sigma, \tau)$ agree
on $\Theta_k$. That is, if the pair $(\sigma', \tau')$ of colourings are drawn from $\Psi_k(\sigma, \tau)$ then $\sigma' = \sigma$ off
$\Theta_k$, $\tau' = \tau$ off $\Theta_k$ and $\sigma' = \tau'$ on $\Theta_k$. This gives $\rho^k_{i,j} = 0$ for any $i \notin \partial\Theta_k$ and $j \in \Theta_k$.

We now need to construct $\Psi_k(\sigma, \tau)$ for the case when $i \in \partial\Theta_k$. For each $j \in \Theta_k$
we need $p_j(\Psi_k(\sigma, \tau))$ to be sufficiently small in order to avoid $\rho^k_{i,j}$ being too big. If the
$\rho^k_{i,j}$-values are too big the parameter $\alpha$ will be too big (that is, at least one) and we
cannot make use of Theorem 3 to show rapid mixing. Constructing $\Psi_k(\sigma, \tau)$ by hand such
that $p_j(\Psi_k(\sigma, \tau))$ is sufficiently small is a difficult task. It is, however, straightforward to
mechanically determine which colourings have positive measure in the distributions $\pi_k(\sigma)$
and $\pi_k(\tau)$ for a given pair of boundary colourings $\sigma_{\partial\Theta_k}$ and $\tau_{\partial\Theta_k}$. From these distributions
we can then use some suitable heuristic to construct a coupling that is good enough for
our purposes. We hence need to construct a specific coupling for each individual pair of
colourings differing only at a single vertex. In order to do this we will make use of the
following lemma, which is proved in Section 3.

Lemma 4. Let $v_1, \dots, v_4$ be the four vertices in a 2x2-block and $z_1, \dots, z_8$ be the boundary
vertices of the block and let the labeling be as in Figure 1. Let $Z$ and $Z'$ be any two
7-colourings of the boundary vertices such that $Z$ and $Z'$ agree on each vertex except on $z_1$.
Let $\pi_Z$ and $\pi_{Z'}$ be the uniform distributions on proper 7-colourings of the block that agree
with $Z$ and $Z'$, respectively. For $i = 1, \dots, 4$ let $p_{v_i}(\Psi)$ denote the probability that the
colour of vertex $v_i$ differs in a pair of colourings drawn from a coupling $\Psi$ of $\pi_Z$ and $\pi_{Z'}$.
Then there exists a coupling $\Psi$ such that $p_{v_1}(\Psi) < 0.283$, $p_{v_2}(\Psi) < 0.079$, $p_{v_3}(\Psi) < 0.051$
and $p_{v_4}(\Psi) < 0.079$.



Figure 1: General labeling of the vertices in a 2x2-block $\Theta_k$ and the vertices $\partial\Theta_k$ on the
boundary of the block.

Figure 2: A 2x2-block showing all eight positions of a vertex $i \in \partial\Theta_k$ on the boundary
of the block in relation to a vertex $j \in \Theta_k$ in the block.



Thus if $i \in \partial\Theta_k$ we let $\Psi_k(\sigma, \tau)$ be the coupling of $\pi_k(\sigma)$ and $\pi_k(\tau)$ that draws the
colouring of $\Theta_k$ from the coupling $\Psi$ in Lemma 4, where $Z$ is the boundary colouring
obtained from $\sigma_{\partial\Theta_k}$ and $Z'$ is obtained from $\tau_{\partial\Theta_k}$, and leaves the colour of the remaining
vertices, $V \setminus \Theta_k$, unchanged. That is, if the pair $(\sigma', \tau')$ of colourings are drawn from
$\Psi_k(\sigma, \tau)$ then $\sigma' = \sigma$ off $\Theta_k$, $\tau' = \tau$ off $\Theta_k$ and the colourings of $\Theta_k$ in $\sigma'$ and $\tau'$ are
drawn from the coupling $\Psi$ in Lemma 4 (see the proof for details on how to construct
$\Psi$). It is straightforward to verify that this is indeed a coupling of $\pi_k(\sigma)$ and $\pi_k(\tau)$. Note
that due to the symmetry of the 2x2-block, with respect to rotation and mirroring, we
can always label the vertices of $\Theta_k$ and $\partial\Theta_k$ such that label $z_1$ in Figure 1 represents the
discrepancy vertex $i$ on the boundary. Hence we can make use of Lemma 4 to compute
upper bounds on the parameters $\rho^k_{i,j}$. We summarise the $\rho^k_{i,j}$-values in the following
Corollary of Lemma 4. Note that due to the symmetry of the block we can assume that
vertex $j \in \Theta_k$ in the corollary is located in the bottom left corner, as Figure 2 shows.

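Corollary 5. Let $\Theta_k$ be any 2x2-block, let $j \in \Theta_k$ be any vertex in the block and let
$i \in \partial\Theta_k$ be a vertex on the boundary of the block. Then

$$\rho^k_{i,j} = \max_{(\sigma, \tau) \in S_i} p_j(\Psi_k(\sigma, \tau)) <
\begin{cases}
0.283, & \text{if $i$ and $j$ as in Figure 2(a) or (b),} \\
0.079, & \text{if $i$ and $j$ as in Figure 2(c) or (h),} \\
0.051, & \text{if $i$ and $j$ as in Figure 2(e) or (f),} \\
0.079, & \text{if $i$ and $j$ as in Figure 2(d) or (g).}
\end{cases} \tag{8}$$

If $i \notin \partial\Theta_k$ is not on the boundary of the block then $\rho^k_{i,j} = 0$.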

We can now use Corollary 5 to prove Theorem 2.
Proof of Theorem 2. Let $\alpha_{k,j} = \sum_i \rho^k_{i,j}$ be the influence on $j$ under $\Theta_k$. We need $\alpha_{k,j}$ to
be strictly less than one for each block $\Theta_k$ and vertex $j \in \Theta_k$ in order to ensure that
$\alpha = \max_k \max_{j \in \Theta_k} \alpha_{k,j}$ is less than one. Fix any block $\Theta_k$ and any vertex $j \in \Theta_k$. A
vertex $i \in \partial\Theta_k$ on the boundary of the block can occupy eight different positions on the
boundary in relation to $j$ as shown in Figure 2(a)-(h). Recall that we are working on
the torus, and hence every vertex on the boundary of the block will belong to $G$. Thus,
using the bounds from Corollary 5 we have

$$\alpha_{k,j} = \sum_i \rho^k_{i,j} < 2(0.283 + 0.079 + 0.051 + 0.079) = 0.984. \tag{9}$$

Then $\alpha = \max_k \max_{j \in \Theta_k} \alpha_{k,j} < \max_k 0.984 = 0.984 < 1$ and we obtain the stated bound
on the mixing time of $\mathcal{M}_{\text{grid}}$ by Theorem 3. □
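The arithmetic behind Equation (9) and the constant 63 in Theorem 2 can be checked with exact rational arithmetic. The sketch below assumes that the bound of Theorem 3 scales with the factor $1/(1 - \alpha)$; the variable names are hypothetical.

```python
from fractions import Fraction
import math

# The four bounds of Lemma 4, as exact fractions.
bounds = [Fraction(283, 1000), Fraction(79, 1000), Fraction(51, 1000), Fraction(79, 1000)]

alpha_bound = 2 * sum(bounds)        # each bound applies to two of the eight boundary positions
scan_factor = 1 / (1 - alpha_bound)  # the factor 1/(1 - alpha), evaluated at the bound on alpha

print(float(alpha_bound), float(scan_factor), math.ceil(scan_factor))
```

The bound on $\alpha$ evaluates to 0.984, the factor $1/(1 - 0.984)$ to 62.5, and rounding up gives 63.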

We make the following remark. In the proof of Theorem 2 above, we assume that $G$
is a finite rectangular grid with toroidal boundary conditions. Hence, every block is a
2x2 sub-grid and each vertex on the block boundary belongs to $V$. We note that if $G$
is a finite rectangular grid without toroidal boundary conditions then some vertices on
the boundary $\partial\Theta_k$ of a block $\Theta_k$ might fall outside $G$. The sum in Equation (9) is over
boundary vertices $i$ that do belong to $V$, and hence the number of terms in this sum is
reduced if some boundary vertices do not belong to $V$, making $\alpha$ smaller. Furthermore,
if $G$ is a non-rectangular region of the grid then a block next to the boundary might be
smaller than 2x2 vertices. Suppose $\Theta_k$ is a block that is smaller than 2x2 vertices. Then
the vertices that are missing in order to make a full 2x2-block are boundary vertices.
Suppose $i \in \partial\Theta_k$ belongs to $V$ and $i' \in \partial\Theta_k$ does not belong to $V$. When constructing
couplings $\Psi_k(\sigma, \tau)$, where $(\sigma, \tau) \in S_i$, we must consider the vertex $i'$ as "colourless",
which would decrease the value of $\rho^k_{i,j}$. A more rigorous analysis yields that our mixing
result with seven colours and 2x2-blocks holds for arbitrary finite regions $G$ of the grid.

Of course we have yet to establish a proof of Lemma 4 and the rest of this paper will be
concerned with this. Our method of proof uses some ideas of Goldberg, Jalsenius, Martin
and Paterson [7] in so far as it is computer-assisted and we will be focusing on minimising
the probability of assigning different colours to vertex $v_1$ in the constructed couplings.
We will however be required to construct a coupling on the 2x2 sub-grid, rather than
establishing bounds on the disagreement probability of a vertex adjacent to the initial
discrepancy and then extending this to a coupling on the whole block recursively. Our
approach is similar to the one Achlioptas et al. [1] take; however, we do not have the
option of constructing an "optimal" coupling using a suitable linear program (even when
feasible) since our probabilities will be maximised over all boundary colourings. The
crucial difference between the approaches is that Achlioptas et al. [1] use path
coupling (see Bubley and Dyer [2]) as a proof technique, which requires them to bound
the expected Hamming distance between a pair of colourings drawn from a coupling. This
in turn enables them to, for a given boundary colouring, specify an "optimal" coupling
which minimises Hamming distance. We are, however, required to bound the influence of
$i$ on $j$ for each boundary colouring and sum over the maximum of these influences. The
reason for this is the inherent maximisation over boundary colourings in the definition of
$\rho^k_{i,j}$ as described above.

Finally it is worth mentioning that providing bounds on the expected Hamming distance
is similar to showing that the influence of a vertex is small and it is known that
this condition implies rapid mixing of a random update Markov chain, see for example
Weitz [15]. In a single-site setting the condition "the influence of a vertex is small" also
implies rapid mixing of systematic scan (Dyer et al. [1]); however, in a block setting this
condition is not sufficient to give rapid mixing of systematic scan (Pedersen [13]), which
is why we need to bound the influence on a vertex.

3 Constructing the coupling by machine 

In order to prove Lemma 4 we will construct a coupling $\Psi$ of $\pi_Z$ and $\pi_{Z'}$ for all pairs of
boundary colourings $Z$ and $Z'$ that are identical on all boundary vertices but vertex $z_1$,
on which $Z$ and $Z'$ differ. For each coupling constructed we verify that the probabilities
$p_{v_i}(\Psi)$, $i = 1, \dots, 4$, are within the bounds of the lemma. The method is well suited to be
carried out with the help of a computer and we have implemented a program in C to do
so. Before stating the proof of Lemma 4 we will discuss how a coupling can be represented
by an edge-weighted complete bipartite graph. We make use of this representation of $\Psi$
in the proof of the lemma.

3.1 Representing a coupling as a bipartite graph 

Let $S$ be a set of objects and let $W$ be a set of $|S|$ pairs $(s, w_s)$ such that $s \in S$ and $w_s \geq 0$
is a non-negative value representing the weight of $s$. Each element $s \in S$ is contained
in exactly one of the pairs in $W$. If the value $w_s$ is an integer (which it is in our case)
it can be regarded as the multiplicity of $s$ in a multiset. The set $W$ is referred to as a
weighted set of $S$. Let $\pi_{S,W}$ be the distribution on $S$ such that the probability of $s$ is
proportional to $w_s$, where $(s, w_s)$ is a pair in $W$. More precisely, the probability of $s$ in
$\pi_{S,W}$ is $\Pr_{\pi_{S,W}}(s) = w_s / \sum_{(t, w_t) \in W} w_t$. For example, let $W$ be a weighted set of $S$ and let
$S' \subseteq S$ be a subset of $S$. Assume the weight $w_s = 0$ if $s \in S \setminus S'$ and $w_s = k$ if $s \in S'$,
where $k > 0$ is a positive constant. Then $\pi_{S,W}$ is the uniform distribution on $S'$.
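The example above can be made concrete with a short Python sketch. The set $S$, the subset $S'$ and the constant $k$ below are hypothetical values chosen for illustration, and a weighted set is stored as a dictionary from elements to integer weights.

```python
from fractions import Fraction

def distribution(weighted_set):
    """Map each element s to its weight w_s divided by the total weight."""
    total = sum(weighted_set.values())
    return {s: Fraction(w, total) for s, w in weighted_set.items()}

S = ["a", "b", "c", "d"]
S_prime = {"b", "d"}   # the subset S'
k = 3                  # any positive constant gives the same distribution

W = {s: (k if s in S_prime else 0) for s in S}
pi_S_W = distribution(W)
print(pi_S_W)
```

Elements outside $S'$ receive probability zero and the two elements of $S'$ each receive probability 1/2, independently of the value of $k$.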

The reason for introducing the notion of a weighted set is that it can be used when
specifying a coupling of two distributions. Let $S$ be a set and let $W$ and $W'$ be two
weighted sets of $S$ such that the sum of the weights in $W$ equals the sum of the weights
in $W'$. Let $w_{\text{tot}}$ denote this sum. That is, $w_{\text{tot}} = \sum_{(s, w_s) \in W} w_s = \sum_{(s', w'_{s'}) \in W'} w'_{s'}$. The
two weighted sets $W$ and $W'$ define two distributions $\pi_{S,W}$ and $\pi_{S,W'}$ on $S$. We want
to specify a coupling $\Psi$ of $\pi_{S,W}$ and $\pi_{S,W'}$. Let $K_{|S|,|S|}$ be an edge-weighted complete
bipartite graph with vertex sets $W$ and $W'$. That is, for each pair $(s, w_s) \in W$ there
is an edge to every pair in $W'$. Every edge $e$ of $K_{|S|,|S|}$ has a weight $w_e \geq 0$ such that
the following condition holds. Let $(s, w_s)$ be any pair in $W \cup W'$ and let $E$ be the set
of all $|S|$ edges incident to $(s, w_s)$. Then $\sum_{e \in E} w_e = w_s$. It follows that the sum of the
edge weights of all $|S|^2$ edges in $K_{|S|,|S|}$ equals $w_{\text{tot}}$, the sum of the weights in $W$ (and
$W'$). The idea is that $K_{|S|,|S|}$ represents a coupling $\Psi$ of $\pi_{S,W}$ and $\pi_{S,W'}$. In order to
draw a pair of elements from $\Psi$ we randomly select an edge $e$ in $K_{|S|,|S|}$ proportional to
its weight. The endpoints of $e$ represent the elements in $S$ drawn from $\pi_{S,W}$ and $\pi_{S,W'}$.
More precisely, the probability of choosing edge $e$ in $K_{|S|,|S|}$ with weight $w_e$ is $w_e / w_{\text{tot}}$.
If edge $e = ((s, w_s), (s', w'_{s'}))$ is chosen it means that we have drawn $s$ from $\pi_{S,W}$ and $s'$
from $\pi_{S,W'}$, the marginal distributions of $\Psi$.
The bipartite graph representation of a coupling will be used when we construct
couplings of colourings of 2x2-blocks in the proof of Lemma 4.
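A minimal Python sketch of this representation is given below. It stores only the edges of positive weight in a dictionary keyed by the pair of endpoints, checks the condition on incident edge weights, and draws a pair by selecting an edge with probability proportional to its weight. The weighted sets and edge weights are hypothetical and the function names are not taken from the program used in the proof.

```python
import random
from fractions import Fraction

def is_coupling(edge_weights, W, W_prime):
    """Check that the edge weights incident to every pair sum to that pair's weight."""
    left_ok = all(sum(edge_weights.get((s, t), 0) for t in W_prime) == W[s] for s in W)
    right_ok = all(sum(edge_weights.get((s, t), 0) for s in W) == W_prime[t] for t in W_prime)
    return left_ok and right_ok

def draw_pair(edge_weights, rng):
    """Select an edge with probability proportional to its weight and return its endpoints."""
    edges = list(edge_weights)
    return rng.choices(edges, weights=[edge_weights[e] for e in edges], k=1)[0]

def mismatch_probability(edge_weights):
    """Probability that the two elements of a drawn pair differ."""
    total = sum(edge_weights.values())
    return Fraction(sum(w for (s, t), w in edge_weights.items() if s != t), total)

# Hypothetical weighted sets of S = {"a", "b", "c"}, both with total weight 4.
W = {"a": 2, "b": 1, "c": 1}
W_prime = {"a": 1, "b": 1, "c": 2}
# Edges not listed have weight zero.
edge_weights = {("a", "a"): 1, ("b", "b"): 1, ("c", "c"): 1, ("a", "c"): 1}

print(is_coupling(edge_weights, W, W_prime), mismatch_probability(edge_weights))
print(draw_pair(edge_weights, random.Random(1)))
```

In this example the only edge joining distinct elements has weight 1 out of a total weight of 4, so a drawn pair differs with probability 1/4.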

3.2 The proof of Lemma 4

Proof of Lemma 4. Fix two colourings $Z$ and $Z'$ of the boundary that differ on vertex
$z_1$. Let $c$ be the colour of vertex $z_1$ in $Z$ and let $d \neq c$ be the colour of $z_1$ in $Z'$. Let $C_Z$
and $C_{Z'}$ be the two sets of proper 7-colourings of the block that agree with $Z$ and $Z'$,
respectively. Let $C^+$ be the set of all 7-colourings of the block. Let $W_Z$ and $W_{Z'}$ be two
weighted sets of $C^+$. The weights are assigned as follows.

• For the pair $(\sigma, w_\sigma) \in W_Z$ let the weight $w_\sigma = |C_{Z'}|$ if $\sigma \in C_Z$, otherwise let $w_\sigma = 0$.

• For the pair $(\sigma, w_\sigma) \in W_{Z'}$ let the weight $w_\sigma = |C_Z|$ if $\sigma \in C_{Z'}$, otherwise let
$w_\sigma = 0$.

It follows from the assignment of the weights that the distribution $\pi_{C^+,W_Z}$ is the uniform
distribution on $C_Z$. That is, $\pi_{C^+,W_Z} = \pi_Z$. Similarly, $\pi_{C^+,W_{Z'}}$ is the uniform distribution
$\pi_{Z'}$ on $C_{Z'}$. Note that the sum of the weights is $|C_Z||C_{Z'}|$ in both $W_Z$ and $W_{Z'}$. Then
a coupling $\Psi$ of $\pi_{C^+,W_Z}$ and $\pi_{C^+,W_{Z'}}$ can be specified with an edge-weighted complete
bipartite graph $K = K_{|C^+|,|C^+|}$. For a given valid assignment of the weights of the edges
of $K$, making $K$ represent a coupling $\Psi$, we can compute the probabilities of having a
mismatch on a vertex $v_i$ of the block when two colourings are drawn from $\Psi$. Let $E$ be
the set of all edges $e = ((\sigma, w_\sigma), (\sigma', w'_{\sigma'}))$ in $K$ such that $\sigma$ and $\sigma'$ differ on vertex $v_i$.
Then $p_{v_i}(\Psi) = \sum_{e \in E} w_e / (|C_Z||C_{Z'}|)$.

In order to obtain sufficiently small upper bounds on $p_{v_i}(\Psi)$ for the four vertices
$v_1, \dots, v_4$ in the block we would like to assign weights to the edges of $K$ such that much
weight is assigned to edges between colourings that agree on many vertices in the block.
In general it is not clear exactly how to assign weights to the edges. For instance, if
we assign too much weight to edges between colourings that are identical on vertex $v_2$
we might not be able to assign as much weight as we would like to edges between
colourings that are identical on vertex $v_4$. Thus, the probability of having a mismatch
on $v_4$ would increase. Intuitively a good strategy would be to assign as much weight as
possible to edges between colourings that are identical on the whole block. This implies
that we try to assign as much weight as possible to edges between colourings that are
identical on vertex $v_1$, the vertex adjacent to the discrepancy vertex $z_1$ on the boundary.
If there is a mismatch on vertex $v_1$ it should be a good idea to assign as much weight
as possible to edges between colourings that are identical on the whole block apart from
vertex $v_1$. This idea leads to a heuristic in which the assignment of the edge weights is
divided into three phases. The exact procedure is described as follows.

In phase one we match identical colourings. For all colourings $\sigma \in C^+$ of the block
the edge $e = ((\sigma, w_\sigma), (\sigma, w'_\sigma))$ in $K$ will be given weight $w_e = \min(w_\sigma, w'_\sigma)$. That is, we
maximise the probability of drawing the same colouring $\sigma$ from both $\pi_{C^+,W_Z}$ and $\pi_{C^+,W_{Z'}}$.

For the following two phases we define an ordering of the colourings in $C^+$. We order
the colourings lexicographically with respect to the vertex order $v_3, v_2, v_4, v_1$. That is,
if the seven colours are $1, \ldots, 7$ the first colouring assigns $1, 1, 1, 1$ to $v_3, v_2, v_4, v_1$,
respectively. The next colouring will be $1, 1, 1, 2$, and so on. This ordering of colourings
in $C^+$ carries over to an ordering of the pairs in $W_Z$ and $W_{Z'}$. That is, we order the pairs
$(\sigma, w_\sigma)$ in $W_Z$ with respect to the lexicographical ordering of $\sigma$. Similarly we order the
pairs in $W_{Z'}$. This ordering of the pairs will be important in the next two phases. It
provides some control of how colourings are being paired up in terms of the assignment
of the weights on edges between pairs. Edges will be considered with respect to this
ordering because choosing an arbitrary ordering of the edges would not necessarily result
in probabilities $p_{v_i}(\Psi)$ that would be within the bounds of the lemma.
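As an illustration, the ordering can be sketched in a few lines of Python. This is only a sketch: the names `COLOURS`, `VERTEX_ORDER` and `ordered_colourings` are hypothetical and are not taken from the C-program. It relies on the fact that `itertools.product` emits tuples in lexicographic order with the last position changing fastest.

```python
from itertools import islice, product

COLOURS = range(1, 8)                    # the seven colours 1, ..., 7
VERTEX_ORDER = ("v3", "v2", "v4", "v1")  # most significant vertex first

def ordered_colourings():
    """Yield every assignment of colours to the block in lexicographic
    order with respect to VERTEX_ORDER, so the colour of v1 changes fastest."""
    for values in product(COLOURS, repeat=len(VERTEX_ORDER)):
        yield dict(zip(VERTEX_ORDER, values))

# The first two colourings in the ordering.
first, second = islice(ordered_colourings(), 2)
print(first)
print(second)
```

The enumeration includes improper colourings such as $1, 1, 1, 1$; in this sketch it is assumed that such colourings simply carry weight zero.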

In the second phase we ignore the colour of vertex $v_1$ and match colourings that are
identical on all of the remaining three vertices $v_2$, $v_3$ and $v_4$. More precisely, for each
pair $(\sigma, w_\sigma) \in W_Z$, considered in the ordering explained above, we consider the edges
$e = ((\sigma, w_\sigma), (\sigma', w'_{\sigma'}))$ where $\sigma$ and $\sigma'$ are identical on all vertices but $v_1$. The edges are
considered in the ordering of the second component $(\sigma', w'_{\sigma'}) \in W_{Z'}$. We assign as much
weight as possible to $e$ such that the total weight on edges incident to $(\sigma, w_\sigma) \in W_Z$ does
not exceed $w_\sigma$ and such that the total weight on edges incident to $(\sigma', w'_{\sigma'}) \in W_{Z'}$ does
not exceed $w'_{\sigma'}$. Note that in the lexicographical ordering of the colourings, vertex $v_1$ is
the least significant vertex and therefore the ordering provides some level of control of
pairing up colourings that are similar on the remaining three vertices. It turns out that
the resulting coupling is sufficiently good for proving the lemma.

In the third and last phase we assign the remaining weights on the edges. As in phase
two, for each pair $(\sigma, w_\sigma) \in W_Z$ we consider the edges $e = ((\sigma, w_\sigma), (\sigma', w'_{\sigma'}))$. The pairs
and edges are considered in accordance with the ordering explained above. The difference
between the second and third phase is that now we do not have any restrictions on the
colourings $\sigma$ and $\sigma'$. We assign as much weight as possible to $e$ such that the total weight
on edges incident to $(\sigma, w_\sigma) \in W_Z$ does not exceed $w_\sigma$ and such that the total weight on
edges incident to $(\sigma', w'_{\sigma'}) \in W_{Z'}$ does not exceed $w'_{\sigma'}$. After phase three we have assigned
all weights to the edges of $K$ and hence $K$ represents a coupling $\Psi$ of $\pi_Z$ and $\pi_{Z'}$.
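The three phases can be sketched as a greedy procedure. The following Python sketch is not the C-program used in the proof; the function names and the toy instance are hypothetical. It assumes that colourings are given as tuples in the vertex order $v_3, v_2, v_4, v_1$, already sorted lexicographically, and that the weights are non-negative integers with the same total on both sides.

```python
from itertools import product

def three_phase_coupling(colourings, w, w_prime):
    """Greedy edge weights for the bipartite graph K.

    colourings: tuples (c_v3, c_v2, c_v4, c_v1) in lexicographic order.
    w, w_prime: non-negative integer weights with equal totals (assumed).
    Returns a dict {(sigma, sigma_prime): weight}.
    """
    rem, rem_prime = dict(w), dict(w_prime)   # weight still unassigned
    edges = {}

    def assign(s, t):
        amount = min(rem[s], rem_prime[t])    # as much weight as possible
        if amount > 0:
            edges[(s, t)] = edges.get((s, t), 0) + amount
            rem[s] -= amount
            rem_prime[t] -= amount

    for s in colourings:                      # phase 1: identical colourings
        assign(s, s)
    for s in colourings:                      # phase 2: identical except on v1
        for t in colourings:
            if s[:-1] == t[:-1]:
                assign(s, t)
    for s in colourings:                      # phase 3: no restriction
        for t in colourings:
            assign(s, t)
    return edges

def mismatch_probability(edges, index):
    """Fraction of the total weight on edges whose endpoints differ at index."""
    total = sum(edges.values())
    differing = sum(wt for (s, t), wt in edges.items() if s[index] != t[index])
    return differing / total

# Toy instance (hypothetical): three colours, where v1 may not receive
# colour 1 under Z and may not receive colour 2 under Z'.
colourings = list(product((1, 2, 3), repeat=4))
w = {s: int(s[3] != 1) for s in colourings}
w_prime = {s: int(s[3] != 2) for s in colourings}
edges = three_phase_coupling(colourings, w, w_prime)
print(mismatch_probability(edges, 3))   # mismatch probability at v1
```

Because both sides carry the same total weight, phase three always exhausts the remaining weight, so the resulting edge weights have the prescribed marginals.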

From $K$ we compute the probabilities $p_{v_1}(\Psi)$, $p_{v_2}(\Psi)$, $p_{v_3}(\Psi)$ and $p_{v_4}(\Psi)$ as described
above. We have written a C-program which loops through all colourings $Z$ and $Z'$ of
the boundary of the block and constructs the bipartite graph $K$ as described above.
For each boundary the probabilities $p_{v_1}(\Psi)$, $p_{v_2}(\Psi)$, $p_{v_3}(\Psi)$ and $p_{v_4}(\Psi)$ are successfully
verified to be within the bounds of the lemma. For details on the C-program, see
http://www.csc.liv.ac.uk/~markus/systematicscan/. □

4 Partial results for 6-colourings of the grid 

In previous sections we have seen that systematic scan on the grid using 2x2-blocks and
seven colours mixes rapidly. An immediate question is whether we can do better and
show rapid mixing with six colours. This matter will be discussed in this section and we
will show that, even with bigger block sizes (up to 3x3), it is not possible to show rapid
mixing using the technique of this paper. More precisely, we will establish lower bounds
on the parameter $\alpha$ for 2x2-blocks, 2x3-blocks and 3x3-blocks. All three lower bounds
are greater than one and hence we cannot make use of Theorem 3 to show rapid mixing.

4.1 Establishing lower bounds for 2x2 blocks

We start by examining the 2x2-block again but this time with six colours. Lemma 4
provides upper bounds (under any colourings of the boundary) on the probabilities of
having discrepancies at each of the four vertices of the block when two 7-colourings are
drawn from the specified coupling. For six colours we will show lower bounds on these
probabilities under any coupling and a specified pair of boundary colourings. Once again,
let $v_1, \ldots, v_4$ be the four vertices in a 2x2-block, let $z_1, \ldots, z_8$ be the boundary vertices
of the block, and let the labeling be as in the earlier figure of the 2x2-block. Let $Z$ and $Z'$
be any two 6-colourings of the boundary vertices that assign the same colour to each vertex
except for $z_1$. Let $\pi_Z$ and $\pi_{Z'}$ be the uniform distributions on the sets of proper
6-colourings of the block that agree with $Z$ and $Z'$, respectively. Let $\Psi^{\min}_{v_k}(Z, Z')$ be a
coupling of $\pi_Z$ and $\pi_{Z'}$ that minimises $p_{v_k}(\Psi)$. That is,
$p_{v_k}(\Psi) \geq p_{v_k}(\Psi^{\min}_{v_k}(Z, Z'))$ for all couplings $\Psi$ of $\pi_Z$ and $\pi_{Z'}$. Also let
$p^{\mathrm{low}}_{v_k} = \max_{Z,Z'} p_{v_k}(\Psi^{\min}_{v_k}(Z, Z'))$. We can hence say that there exist two
6-colourings $Z$ and $Z'$ of the boundary of a 2x2-block, that assign the same colour to
each vertex except for $z_1$, such that $p_{v_k}(\Psi) \geq p^{\mathrm{low}}_{v_k}$ for any coupling $\Psi$ of $\pi_Z$ and $\pi_{Z'}$. We
have the following lemma, which is proved by computation.

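Lemma 6. Consider 6-colourings of the 2x2-block, labelled as in the earlier figure of the
2x2-block. Then $p^{\mathrm{low}}_{v_1} \geq 0.379$, $p^{\mathrm{low}}_{v_2} \geq 0.107$, $p^{\mathrm{low}}_{v_3} \geq 0.050$ and $p^{\mathrm{low}}_{v_4} \geq 0.107$.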

Proof. Fix one vertex $v_k$ in the block and fix two colourings $Z$ and $Z'$ of the boundary of
the block that differ only on the colour of vertex $z_1$. Let $C_Z$ and $C_{Z'}$ be the two sets of
proper 6-colourings of the block that agree with $Z$ and $Z'$, respectively. For $c = 1, \ldots, 6$
let $n_c$ be the number of colourings in $C_Z$ in which vertex $v_k$ is assigned colour $c$. Similarly
let $n'_c$ be the number of colourings in $C_{Z'}$ in which vertex $v_k$ is assigned colour $c$. It is
clear that the probability that $v_k$ is assigned colour $c$ in a colouring $\sigma'$ drawn from $\pi_Z$
is $\Pr_{\pi_Z}(\sigma'_{v_k} = c) = n_c/|C_Z|$. For $c = 1, \ldots, 6$ define $m_c = n_c|C_{Z'}|$, $m'_c = n'_c|C_Z|$ and
$M = |C_Z||C_{Z'}|$. It follows that $\Pr_{\pi_Z}(\sigma'_{v_k} = c) = m_c/M$ and $\Pr_{\pi_{Z'}}(\tau'_{v_k} = c) = m'_c/M$,
where $\sigma'$ and $\tau'$ are colourings drawn from $\pi_Z$ and $\pi_{Z'}$, respectively. Observe that the
quantities $m_c$, $m'_c$ and $M$ can be easily computed for a given pair of boundary colourings.

Now let $\Psi$ be any coupling of $\pi_Z$ and $\pi_{Z'}$. It is easy to see that the probability that
vertex $v_k$ is coloured $c$ in both colourings drawn from $\Psi$ can be at most $\min(m_c, m'_c)/M$.
Therefore, the probability of drawing two colourings from $\Psi$ such that the colour of vertex
$v_k$ is the same in both colourings is at most $\sum_{c=1}^{6} \min(m_c, m'_c)/M$, and the probability
of assigning different colours to vertex $v_k$ satisfies $p_{v_k}(\Psi) \geq 1 - \sum_{c=1}^{6} \min(m_c, m'_c)/M$.
We have successfully verified the bounds in the statement of the lemma by maximising
the lower bound on $p_{v_k}(\Psi)$ over all boundary colourings $Z$ and $Z'$ for each vertex $v_k$
in the block. The computations are carried out with the help of a computer program
written in C. For details on the program, see
http://www.csc.liv.ac.uk/~markus/systematicscan/. □
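To make the quantity $1 - \sum_{c} \min(m_c, m'_c)/M$ concrete, the following Python sketch evaluates it for one pair of boundary colourings of a 2x2-block. It is not the C-program referred to in the proof. The coordinates, the names `BLOCK`, `BOUNDARY`, `compatible_colourings` and `mismatch_lower_bound`, and the example boundary are all assumptions made for illustration, and the placement of the discrepancy vertex at `(-1, 0)` is one arbitrary choice of labelling.

```python
from fractions import Fraction
from itertools import product

BLOCK = [(0, 0), (0, 1), (1, 0), (1, 1)]       # (row, column) of block vertices
BOUNDARY = [(-1, 0), (-1, 1), (0, 2), (1, 2),
            (2, 1), (2, 0), (1, -1), (0, -1)]  # the eight boundary vertices

def neighbours(v):
    r, c = v
    return [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]

def compatible_colourings(boundary, q=6):
    """Proper q-colourings of BLOCK that agree with the boundary colouring."""
    found = []
    for values in product(range(1, q + 1), repeat=len(BLOCK)):
        col = dict(zip(BLOCK, values))
        # Every neighbour of a block vertex lies in the block or on the boundary.
        if all(col[v] != col.get(u, boundary.get(u))
               for v in BLOCK for u in neighbours(v)):
            found.append(col)
    return found

def mismatch_lower_bound(Z, Z_prime, vertex, q=6):
    """Return 1 - sum_c min(m_c, m'_c) / M for the given block vertex."""
    C_Z = compatible_colourings(Z, q)
    C_Zp = compatible_colourings(Z_prime, q)
    M = len(C_Z) * len(C_Zp)
    overlap = 0
    for c in range(1, q + 1):
        m_c = sum(col[vertex] == c for col in C_Z) * len(C_Zp)
        m_c_prime = sum(col[vertex] == c for col in C_Zp) * len(C_Z)
        overlap += min(m_c, m_c_prime)
    return 1 - Fraction(overlap, M)

# Hypothetical example: the boundary is coloured 1 everywhere under Z, and
# Z' differs only on the boundary vertex above the block vertex (0, 0).
Z = {u: 1 for u in BOUNDARY}
Z_prime = dict(Z)
Z_prime[(-1, 0)] = 2
bound = mismatch_lower_bound(Z, Z_prime, (0, 0))
print(bound)
```

Maximising this value over all pairs of boundary colourings that differ on a single vertex would give a lower bound of the kind stated in the lemma; the sketch evaluates a single pair only.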

For seven colours, Corollary 5 makes use of Lemma 4 to establish upper bounds on
the influence parameters $\rho^k_{i,j}$. These parameters are used in the proof of Theorem 2 to
obtain an upper bound on the parameter $\alpha$. The upper bound on $\alpha$ is shown to be less
than one which implies rapid mixing for seven colours when applying Theorem 3. We can
use Lemma 6 to obtain lower bounds on the influence parameters by completing the
coupling in a way analogous to the coupling in Corollary 5. This in turn will result in a
lower bound on the parameter $\alpha$ that is greater than one. That is, following the proof of
Theorem 2 and making use of Lemma 6, a lower bound on $\alpha$ will be

$$\alpha \geq 2(0.379 + 0.107 + 0.050 + 0.107) = 1.286 > 1. \qquad (10)$$

Hence we fail to show rapid mixing of systematic scan with six colours using 2x2-blocks.

Figure 3: (a) General labeling of the vertices in a 2x3-block $\Theta_k$, and the vertices $\partial\Theta_k$ on
the boundary of the block. (b)-(c) All ten positions of a vertex $i \in \partial\Theta_k$ on the boundary
of the block in relation to a vertex $j \in \Theta_k$ in the corner of the block.



4.2 Bigger blocks 

We failed to show rapid mixing of systematic scan with six colours and 2x2-blocks and
we will now show that increasing the block size to 2x3 or 3x3 will not be sufficient
either. Lemma 7 below considers 2x3-blocks and is analogous to Lemma 6. We
make use of the same notation as for Lemma 6, only the block is bigger and the labeling
of the vertices is different (see Figure 3(a)). Lemma 7 is proved by computation
in the same way as Lemma 6. For details on the C-program used in the proof, see
http://www.csc.liv.ac.uk/~markus/systematicscan/.

Lemma 7. Consider 6-colourings of the 2x3-block in Figure 3(a). Then the lower bounds
$p^{\mathrm{low}}_{v_k}$ for the four vertices $v_k$ considered are at least 0.3671, 0.0298, 0.0997 and 0.0174.

We will now use Lemma 7 to show that $\alpha > 1$ for 2x3-blocks. Let $\Theta_k$ be any 2x3-block
and let $j \in \Theta_k$ be a vertex in a corner of the block. A vertex $i \in \partial\Theta_k$ on the
boundary of the block can occupy ten different positions on the boundary in relation to
$j$. See Figure 3(b) and (c). We can again determine lower bounds on the influences $\rho^k_{i,j}$
of $i$ on $j$ under $\Theta_k$ from Lemma 7. However, Lemma 7 provides lower bounds on $\rho^k_{i,j}$ only
when $i \in \partial\Theta_k$ is adjacent to a corner vertex of the block, as in Figure 3(b). If $i$ is located
as in Figure 3(c) we do not know more than that $\rho^k_{i,j}$ is bounded from below by zero.
Nevertheless, the lower bound on $\alpha$ exceeds one. Let $\alpha_{k,j} = \sum_i \rho^k_{i,j}$ be the influence on $j$
under $\Theta_k$. Following the proof of Theorem 2 and using the lower bounds in Lemma 7 we
have

$$\alpha_{k,j} = \sum_{i \text{ in Fig. 3(b)}} \rho^k_{i,j} + \sum_{i \text{ in Fig. 3(c)}} \rho^k_{i,j}
\geq 2(0.3671 + 0.0298 + 0.0997 + 0.0174) = 1.028, \qquad (11)$$

where we set the lower bound on the second sum to zero. Now,

$$\alpha = \max_k \max_{j \in \Theta_k} \alpha_{k,j} \geq 1.028 > 1. \qquad (12)$$

Figure 4: (a)-(b) General labeling of the vertices in a 3x3-block and two different
labellings of the vertices $\partial\Theta_k$ on the boundary of the block. The discrepancy vertex
on the boundary has label $z_1$. (c)-(d) All twelve positions of a vertex $i \in \partial\Theta_k$ on the
boundary of the block in relation to a vertex $j \in \Theta_k$ in the corner of the block.



Hence we cannot use Theorem 3 to show rapid mixing of systematic scan with six colours
and 2x3-blocks. It is interesting to note that considering 2x3-blocks was sufficient for
Achlioptas et al. [1] to prove mixing of a random update Markov chain for sampling
6-colourings of the grid.

Lastly, we increase the block size to 3x3 and show that a lower bound on $\alpha$ is still
greater than one. We have the following lemma which is proved by computation in the
same way as Lemmas 6 and 7. For details on the C-program used in the proof see
http://www.csc.liv.ac.uk/~markus/systematicscan/.

Lemma 8. For 6-colourings of the 3x3-block with vertices labeled as in Figure 4(a), the
lower bounds $p^{\mathrm{low}}_{v_k}$ for the four corner vertices are at least 0.3537, 0.0245, 0.0245 and
0.0071. Furthermore, for 6-colourings of the 3x3-block in Figure 4(b), they are at least
0.0838, 0.0838, 0.0138 and 0.0138.

Note that Lemma 8 provides lower bounds on the probabilities of having a mismatch
on a corner vertex of the block when the discrepancy vertex on the boundary (labeled $z_1$)
is adjacent to a corner vertex (Figure 4(a)) and adjacent to a middle vertex (Figure 4(b)).
Let $\Theta_k$ be any 3x3-block and let $j \in \Theta_k$ be a vertex in a corner of the block. A vertex
$i \in \partial\Theta_k$ on the boundary of the block can occupy twelve different positions on the
boundary in relation to $j$. See Figure 4(c) and (d). Analogous to Corollary 5, lower
bounds on the influences $\rho^k_{i,j}$ of $i$ on $j$ under $\Theta_k$ can be determined from Lemma 8. Let
$\alpha_{k,j} = \sum_i \rho^k_{i,j}$ be the influence on $j$ under $\Theta_k$. Following the proof of Theorem 2 and
using the lower bounds in Lemma 8 we have

$$\alpha_{k,j} = \sum_{i \text{ in Fig. 4(c)}} \rho^k_{i,j} + \sum_{i \text{ in Fig. 4(d)}} \rho^k_{i,j}
\geq 2(0.3537 + 0.0245 + 0.0245 + 0.0071) + (0.0838 + 0.0838 + 0.0138 + 0.0138) = 1.0148. \qquad (13)$$

Thus, $\alpha = \max_k \max_{j \in \Theta_k} \alpha_{k,j} \geq 1.0148 > 1$. Hence, we cannot use Theorem 3 to show
rapid mixing of systematic scan with six colours and 3x3-blocks.
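The arithmetic behind the three lower bounds (10), (11) and (13) is easy to check mechanically. The short Python sketch below does so with exact rational arithmetic to avoid floating-point rounding; the helper `total` and the variable names are hypothetical, and the numbers are the constants stated in Lemmas 6, 7 and 8.

```python
from fractions import Fraction

def total(values):
    """Exact sum of decimal strings."""
    return sum(Fraction(v) for v in values)

# Equation (10): 2x2-blocks.
bound_2x2 = 2 * total(["0.379", "0.107", "0.050", "0.107"])
# Equation (11): 2x3-blocks, with the second sum bounded below by zero.
bound_2x3 = 2 * total(["0.3671", "0.0298", "0.0997", "0.0174"])
# Equation (13): 3x3-blocks.
bound_3x3 = (2 * total(["0.3537", "0.0245", "0.0245", "0.0071"])
             + total(["0.0838", "0.0838", "0.0138", "0.0138"]))

for name, value in [("2x2", bound_2x2), ("2x3", bound_2x3), ("3x3", bound_3x3)]:
    print(name, float(value), value > 1)
```

The margin above one shrinks from 0.286 to 0.028 to 0.0148 as the block grows.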

A natural question is whether we can show rapid mixing using even bigger blocks. It
seems possible to do this although the computations rapidly become intractable as the
block size increases. Already with a 3x3-block the number of boundary colourings we
need to consider (after removing isomorphisms) is in excess of $10^6$ and for each boundary
colouring there are more than $10^7$ colourings of the block to consider. In addition to
simply generating the distributions on colourings of the block, the time it would take
to actually construct the required couplings, as we did in the proof of Lemma 4, would
also increase. Finally, when using a larger block size, different positions of vertex $j$ in
the block need to be considered, whereas we could make use of the symmetry of the
2x2-block to only consider one position of vertex $j$ in the block. If different positions of
$j$ have to be considered this has to be captured in the construction of the coupling and
would likely require more computations. The conclusion is that in order to show rapid
mixing of systematic scan on the grid for six colours we would most likely have to rely
on a different approach than the one presented in this paper.
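One way to arrive at numbers of this magnitude is sketched below in Python. The sketch rests on two assumptions that the text does not confirm: that "removing isomorphisms" means identifying boundary colourings that differ only by a permutation of the six colours, and that all $6^9$ assignments of colours to the nine block vertices are counted, not only the proper ones. Under the first assumption, the classes of colourings of the twelve boundary vertices correspond to partitions of a 12-element set into at most six parts, which are counted by Stirling numbers of the second kind. The function name `stirling2_row` is hypothetical.

```python
def stirling2_row(n):
    """Return [S(n, 0), ..., S(n, n)], the Stirling numbers of the second
    kind, using the recurrence S(m, k) = S(m-1, k-1) + k * S(m-1, k)."""
    row = [1]                                  # S(0, 0) = 1
    for m in range(1, n + 1):
        new = [0] * (m + 1)
        for k in range(1, m + 1):
            previous = row[k] if k < len(row) else 0
            new[k] = row[k - 1] + k * previous
        row = new
    return row

BOUNDARY_VERTICES = 12   # boundary of a 3x3-block
BLOCK_VERTICES = 9
COLOURS = 6

row = stirling2_row(BOUNDARY_VERTICES)
# Boundary colourings up to permutation of colours (assumption).
boundary_classes = sum(row[k] for k in range(1, COLOURS + 1))
# All colourings of the block, proper or not (assumption).
block_colourings = COLOURS ** BLOCK_VERTICES

print(boundary_classes, block_colourings)
```

Both counts exceed the thresholds quoted above, which is consistent with the stated orders of magnitude, although the exact accounting used for the computations may differ.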